Improving a Credit Scoring Model by Incorporating Bank Statement Derived Features

Authors

  • Rory P. Bunker
  • Wenjun Zhang
  • M. Asif Naeem
Abstract

In this paper, we investigate the extent to which features derived from bank statements provided by loan applicants, and which are not declared on an application form, can enhance a credit scoring model for a New Zealand lending company. The potential of such information to improve credit scoring models in this manner has not been studied previously. We construct a baseline model based solely on the existing scoring features obtained from the loan application form, and a second baseline model based solely on the new bank statement derived features. A combined feature model is then created by augmenting the application form features with the new bank statement derived features. Our experimental results show that the combined feature model outperforms both baseline models, and that a number of the bank statement derived features have value in improving the credit scoring model. As is often the case in credit scoring, our target data was highly imbalanced, and Naive Bayes was found to be the best performing classifier, outperforming a number of other classifiers commonly used in credit scoring. Future experimentation with Naive Bayes on other highly imbalanced credit scoring data sets will help to confirm whether the classifier should be more commonly used in the credit scoring context.
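The three-way comparison described in the abstract (application form features only, bank statement features only, and both combined) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data is synthetic, the 10% default rate and feature distributions are invented, the classifier is a hand-written Gaussian Naive Bayes, balanced accuracy is used as one possible metric for an imbalanced target, and all function names are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def make_synthetic(n, seed=0):
    """Hypothetical data: two 'application' and two 'bank statement' features."""
    rng = random.Random(seed)
    X_app, X_bank, y = [], [], []
    for _ in range(n):
        default = 1 if rng.random() < 0.1 else 0  # assumed ~10% minority class
        # In each group, the first feature shifts with the label; the second is noise.
        X_app.append([rng.gauss(1.5 * default, 1.0), rng.gauss(0.0, 1.0)])
        X_bank.append([rng.gauss(1.5 * default, 1.0), rng.gauss(0.0, 1.0)])
        y.append(default)
    return X_app, X_bank, y


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "mean": [mean(c) for c in cols],
            "var": [pvariance(c) + 1e-9 for c in cols],  # avoid zero variance
        }
    return model


def log_posterior(params, x):
    """Unnormalised log posterior under the feature-independence assumption."""
    lp = math.log(params["prior"])
    for v, m, s2 in zip(x, params["mean"], params["var"]):
        lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return lp


def predict(model, x):
    return max(model, key=lambda label: log_posterior(model[label], x))


def balanced_accuracy(model, X, y):
    """Mean of per-class recall over the classes present in y."""
    recalls = []
    for label in set(y):
        idx = [i for i, t in enumerate(y) if t == label]
        hits = sum(predict(model, X[i]) == label for i in idx)
        recalls.append(hits / len(idx))
    return mean(recalls)


def compare_feature_sets(n=3000, seed=0):
    """Score the two baselines and the combined model on a held-out split."""
    X_app, X_bank, y = make_synthetic(n, seed)
    X_both = [a + b for a, b in zip(X_app, X_bank)]  # concatenate feature lists
    cut = int(0.7 * n)
    scores = {}
    for name, X in [("application", X_app),
                    ("bank_statement", X_bank),
                    ("combined", X_both)]:
        model = fit_gaussian_nb(X[:cut], y[:cut])
        scores[name] = balanced_accuracy(model, X[cut:], y[cut:])
    return scores


if __name__ == "__main__":
    for name, score in compare_feature_sets().items():
        print(f"{name}: balanced accuracy = {score:.3f}")
```

Because the data is synthetic, the printed scores say nothing about the paper's results; the sketch only shows how the two baselines and the combined model can be evaluated side by side on the same split.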

Similar resources

Using the Hybrid Model for Credit Scoring (Case Study: Credit Clients of microloans, Bank Refah-Kargeran of Zanjan, Iran)

In any country, commercial banks lay the groundwork for economic growth by collecting national resources and capitals and allocating them to different economic sectors. Optimal allocation of resources is especially important in achieving this goal. Banks with an effective and dynamic system of customer assessment can efficiently allocate their resources to customers regardless of their geograph...

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed with external credit bureau data and finally used for estimating PD in banks. There is also a continuing interest among banks in using rule-based classifiers to b...

Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)

Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems in order to manage and minimize the inherent risks of the financial sector, thus, the design and improvement of credit scorin...

Credit rating of the bank legal customers by using the improved modified Russell model (Case study: the legal customers of Arak Melli Bank)

Most of a country's exchange volume passes through the banking system, whose correct functioning plays a determining role in improving economic activity. Nowadays, banks pay more attention than before to customer rating and accreditation, owing to the increasing volume of overdue claims and the banks' past-due debts. One of the most important tools for controlling the bank...

The Use of Genetic Algorithm, Clustering and Feature Selection Techniques in Construction of Decision Tree Models for Credit Scoring

Decision tree modelling, one of the data mining techniques, is used for credit scoring of bank customers. The main problem is the construction of decision trees that can classify customers optimally. This study presents a new hybrid mining approach to the design of an effective and appropriate credit scoring model. It is based on a genetic algorithm for credit scoring of bank customers in order ...

Journal:
  • CoRR

Volume: abs/1611.00252

Publication date: 2016